Zhuosheng Zhang

Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

May 29, 2026
Via arXiv

GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection

May 27, 2026
Via arXiv

Mobile-Aptus: Confidence-Driven Proactive and Robust Interaction in MLLM-based Mobile-Using Agents

May 27, 2026
Via arXiv

Causal Probing for Internal Visual Representations in Multimodal Large Language Models

May 07, 2026
Via arXiv

OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents

Apr 27, 2026
Via arXiv

Generalizable Detection of AI Generated Images with Large Models and Fuzzy Decision Tree

Mar 30, 2026
Via arXiv

Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Feb 16, 2026
Via arXiv

Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation

Feb 15, 2026
Via arXiv

Adaptive Milestone Reward for GUI Agents

Feb 12, 2026
Via arXiv

Breaking the Overscaling Curse: Thinking Parallelism Before Parallel Thinking

Jan 29, 2026
Via arXiv